228 PART 5 Looking for Relationships with Correlation and Regression

» An r² value of 0 means that your data points are all over the place, with no

tendency at all for the X and Y variables to be associated.

» An r² value of 0.3 (as in this example) means that 30 percent of the variance in

the dependent variable is explainable by the independent variable in this

straight-line model.
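
To see where the "percent of variance explained" reading comes from, here is a minimal Python sketch. It assumes a small set of made-up weight and SBP values (they are not the data behind the figure), and the helper names fit_line and r_squared are our own. It computes r² as 1 minus the ratio of the leftover (residual) variation to the total variation in Y.

```python
# Hypothetical weights and systolic blood pressures, invented for illustration.
# These are NOT the data used in the book's example.
weight = [60.0, 72.0, 81.0, 90.0, 68.0, 77.0, 95.0, 85.0]
sbp = [112.0, 121.0, 118.0, 135.0, 125.0, 116.0, 138.0, 124.0]


def fit_line(x, y):
    """Return the least-squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(x, y):
    """Proportion of the variation in y accounted for by the fitted line."""
    a, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    # Total variation of Y around its mean.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    # Variation left over after the straight line has done its work.
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total


print(f"r squared = {r_squared(weight, sbp):.3f}")
```

A value near 0 means the line explains almost none of the variation in Y; a value of 1 means every point sits exactly on the line.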

Note: Figure 16-4 also lists the Adjusted R-squared at the bottom right. We talk

about the adjusted r² value in Chapter 17 when we explain multiple regression, so

for now, you can just ignore it.

The F statistic

The last line of the sample output in Figure 16-4 presents the F statistic and

associated p value (under F-statistic). These estimates address this question: Is the

straight-line model any good at all? In other words, how much better is the

straight-line model, which contains an intercept and a predictor variable, at pre-

dicting the outcome compared to the null model?

The null model is a model that contains only a single parameter representing a

constant term with no predictor variables at all. In this case, the null model would

only include the intercept.

Under α = 0.05, if the p value associated with the F statistic is less than 0.05, then

adding the predictor variable to the model makes it statistically significantly bet-

ter at predicting SBP than the null model.

For this example, the p value of the F statistic is 0.013, which is statistically sig-

nificant. It means using weight as a predictor of SBP is statistically significantly

better than just guessing that everyone in the data set has the mean SBP (which is

what the null model predicts).
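
If you want to see the comparison with the null model spelled out, the following Python sketch may help. It assumes made-up weight and SBP values (not the data behind the figure), and the function name f_statistic is our own. It computes the F statistic as the reduction in squared prediction error gained by adding the predictor, scaled by the error that remains.

```python
# Hypothetical weights and systolic blood pressures, invented for illustration.
# These are NOT the data used in the book's example.
weight = [60.0, 72.0, 81.0, 90.0, 68.0, 77.0, 95.0, 85.0]
sbp = [112.0, 121.0, 118.0, 135.0, 125.0, 116.0, 138.0, 124.0]


def f_statistic(x, y):
    """F statistic comparing the straight-line model with the null model."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares slope and intercept of the straight-line model.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x

    # Null model: predict the mean of Y for everyone.
    ss_null = sum((yi - mean_y) ** 2 for yi in y)
    # Straight-line model: predict a + b * X.
    ss_line = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    df_model = 1      # one predictor added to the null model
    df_resid = n - 2  # n observations minus two estimated parameters
    f = ((ss_null - ss_line) / df_model) / (ss_line / df_resid)
    return f, df_model, df_resid


f, df1, df2 = f_statistic(weight, sbp)
print(f"F = {f:.2f} on {df1} and {df2} degrees of freedom")
# If SciPy is installed, the p value is the upper tail of the F distribution:
#     scipy.stats.f.sf(f, df1, df2)
```

The bigger the drop from the null model's error to the straight-line model's error, the bigger F gets and the smaller the p value becomes.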

Scientific fortune-telling with the prediction formula

As we describe in Chapter 15, one reason to do regression in biostatistics is to

develop a prediction formula that allows you to make an educated guess about

the value of a dependent variable if you know the values of the independent variables.

You are essentially developing a predictive model.

Some statistics programs show the actual equation of the best-fitting straight

line. If yours doesn’t, don’t worry. Just substitute the coefficients of the intercept

and slope for a and b in the straight-line equation: Y = a + bX.
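
As a quick illustration of plugging coefficients into the straight-line equation, here is a minimal Python sketch. The intercept and slope values are hypothetical placeholders (not the coefficients from the sample output), and the function name predict is our own.

```python
def predict(x, intercept, slope):
    """Apply the straight-line prediction formula Y = a + bX."""
    return intercept + slope * x


# Hypothetical coefficients chosen only for illustration; replace them with
# the intercept and slope reported by your own regression output.
a = 80.0  # intercept
b = 0.5   # slope: change in Y for each one-unit increase in X

print(f"Predicted Y at X = 70: {predict(70.0, a, b):.1f}")
# prints "Predicted Y at X = 70: 115.0"
```

With these placeholder values, a person with X = 70 gets a predicted Y of 80 + 0.5 × 70 = 115.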